ABSTRACT: We present an algorithm for constructing an optimal slate of sponsored search
advertisements which respects the ordering that is the outcome of a generalized second price auction,
but which must also accommodate complicating factors such as overall budget constraints. The 
algorithm is easily fast enough to use on the fly for typical problem sizes, or as a subroutine in an 
overall optimization. 
1. Introduction

In this paper we consider the problem of constructing a maximum utility "slate" of ads for
display in response to either a search term or a content page requesting ads from a server. We shall
treat both cases essentially the same: some ordered set of ads (the "slate") is to be
returned in response to a "query".

Let us denote the n bidders on this query by j = 1, ..., n, and suppose that some form of
Generalized Second Price auction (GSP) [0] has been carried out, which induces a ranking of the
bidders. For simplicity, let the numerically ordered indices also indicate the bid ranking, initially
assumed to be determined solely by the bids (sometimes called the Overture ranking).¹ That is, if
we denote the bid of bidder j by $A_j$, then

$$A_1 > A_2 > \dots > A_n.$$

Let m be the maximum number of positions, and let $T_{jp}$ be the click-through rate (CTR) of bidder
j when his ad is at position p. Finally, let $p_j$ be a "utility factor" associated with the appearance of
bidder j's ad in response to the query.

We define a slate as an ordered subset $S = \{j_1, \dots, j_k\}$, where $k \le m$, of the ordered set of
ads $\{1, \dots, n\}$. Since we are initially assuming a second-price auction by bid value, bidder $j_p$ in
position p pays the bid $A_{j_{p+1}}$ of the bidder occupying position p + 1. In addition there is a minimum
bid $e$, which is paid by the last bidder in the slate if and only if there are no lower bidders on the
query. Under these assumptions we wish to solve

$$\max \; U = \sum_{p=1}^{k} p_{j_p} \, T_{j_p p} \, A_{j_{p+1}} \qquad (1)$$

subject to the requirement that $j_1, \dots, j_k$ is an increasing set of indices, and $k \le m$.
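As a concrete reading of (1), the following Python sketch evaluates the utility of one given slate. The function name `slate_utility`, the argument names, and the 0-based indexing are illustrative choices and not part of the original formulation. The sketch also assumes that the last ad in the slate is charged the bid of the next-ranked bidder when one exists, and the minimum bid otherwise.

```python
def slate_utility(slate, bids, ctr, weight, min_bid):
    """Utility (1) of an ordered slate (illustrative sketch).

    slate   -- strictly increasing 0-based bidder indices, one per position
    bids    -- bids[j], sorted in decreasing order (the bid ranking)
    ctr     -- ctr[j][p], CTR of bidder j at 0-based position p
    weight  -- weight[j], the utility factor of bidder j
    min_bid -- minimum bid, used when no lower-ranked bidder exists
    """
    if any(a >= b for a, b in zip(slate, slate[1:])):
        raise ValueError("slate indices must be strictly increasing")
    total = 0.0
    for pos, j in enumerate(slate):
        if pos + 1 < len(slate):
            price = bids[slate[pos + 1]]   # bid of the ad in the next position
        elif j + 1 < len(bids):
            price = bids[j + 1]            # assumed: bid of the next-ranked bidder
        else:
            price = min_bid                # no lower bidder on the query
        total += weight[j] * ctr[j][pos] * price
    return total
```

For three bidders with bids (4, 3, 2), unit utility factors, and CTRs of 0.5 and 0.25 in the two positions, the slate of the top two bidders has utility 0.5 × 3 + 0.25 × 2 = 2.0.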

Assuming the CTRs are independent, the utility U of the slate when the pj are all unity is easily 
seen to be the expected revenue from the slate. However there may be several reasons why we wish 
to consider other values of the pj. Some of the more important of these are: 

1. The advertiser placing the ad may be at, or near, its budget, thus reducing the desirability 
of showing it. This is our primary motivation, and this is a companion paper to [?], which
discusses this in detail. For convenience, we include a brief description of this model in an 
Appendix. 

2. "Ad Fatigue" induced by too-frequent showing of an ad may reduce its effectiveness. We 
might therefore wish to penalize some ads which have been shown above some threshold. 

3. We may be given (or wish to have) the CTRs as a product of two components: a component
solely due to the ad, independent of position, and a position-only dependent component. In
this case we may reduce $T_{jp}$ to a position-only component $T_p$ and an ad-dependent component
which becomes a contributor to $p_j$. However, this is not a necessary feature and we will
usually continue to refer to the CTRs as $T_{jp}$.

The treatment we give here is independent of the source of the $p_j$.

¹ This assumption (that the ranking is determined solely by the bids) will later be generalized.

There is some superficial similarity with the well-known knapsack problem [?]: we are selecting
a subset S of up to m items (the ads), subject to constraints. Also related is the "knapsack
auction" considered in [3]. Even more strongly related is the cardinality constrained knapsack
problem [?], since our slates have a fixed maximum size. However, the ordering requirement makes
the problem more specialized. Nevertheless, like the knapsack problem, our problem is amenable
to a dynamic programming approach.

2. Backward Recursion 

We begin by giving an $O(n^2 m)$ algorithm for solving (1) using a dynamic programming (DP)
algorithm with backward recursion [?]. For this approach, it is convenient to include m dummy
bidders, call them n + 1, ..., n + m, with $p_j = 0$ and $A_j = e$ for j = n + 1, ..., n + m. The
reason for doing this is that we only need to consider slates of size equal to m; smaller slates can be
padded with dummy bidders at the end to produce the same effect and make a slate of size exactly
m.

Take one j and one s such that $1 \le s \le m$. Let us define a subslate starting from j at position
s as a set of increasing indices, $S = \{j_s, j_{s+1}, \dots, j_m\}$ such that $j_s = j$. Let $\mathcal{S}(s, j)$ denote the set
of all such subslates. Consider the problem of computing the best "marginal-revenue-to-go":

$$F(j, s) = \max_{S \in \mathcal{S}(s, j)} \sum_{p=s}^{m} p_{j_p} \, T_{j_p p} \, A_{j_{p+1}} \qquad (2)$$

Suppose we fix $j_{s+1}$ and proceed optimally from there. If $F(j_{s+1}, s + 1)$ is known for all $j_{s+1} > j$
then we can compute (2) in standard dynamic programming fashion as:

$$F(j, s) = \max_{j_{s+1} > j} \left[ p_j \, T_{js} \, A_{j_{s+1}} + F(j_{s+1}, s + 1) \right] \qquad (3)$$

We can start the DP algorithm by setting $F(j, s) = 0 \;\; \forall \; j = n + 1, \dots, n + m, \; s = 1, \dots, m$.
Now recurse backwards and compute $F(j, s) \;\; \forall \; s = 1, \dots, m$ and $j = n, \dots, 1$. Finally, choose
$\max_j F(j, 1)$ to get the solution of (1) as well as the optimal slate.
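The recursion (3) can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the name `best_slate_backward`, the 0-based indexing, and the all-zero boundary row beyond the last position (needed to close the recursion at s = m) are assumptions. The sketch also presumes nonnegative utility factors and bids no smaller than the minimum bid, so that the maximization charges the last position the bid of the next-ranked bidder.

```python
def best_slate_backward(bids, ctr, weight, min_bid, m):
    """Backward recursion (3) with m dummy bidders appended (sketch).

    bids[j] are in decreasing order; ctr[j][s] is the CTR of bidder j at
    0-based position s; weight[j] is the utility factor of bidder j.
    Returns (best utility, real bidder indices in slate order).
    """
    n = len(bids)
    total = n + m                          # real bidders followed by m dummies
    A = list(bids) + [min_bid] * m         # dummies bid the minimum bid

    # F[s][j]: best utility-to-go when bidder j occupies position s.
    # Dummy columns stay 0; row m is the assumed zero boundary.
    F = [[0.0] * total for _ in range(m + 1)]
    succ = [[None] * total for _ in range(m)]

    for s in range(m - 1, -1, -1):         # positions, last to first
        for j in range(n - 1, -1, -1):     # real bidders only
            best, arg = float("-inf"), None
            for nxt in range(j + 1, total):
                val = weight[j] * ctr[j][s] * A[nxt] + F[s + 1][nxt]
                if val > best:
                    best, arg = val, nxt
            F[s][j], succ[s][j] = best, arg

    # Pick the best occupant of the first position, then follow successors.
    start = max(range(total), key=lambda j: F[0][j])
    slate, j, s = [], start, 0
    while s < m and j < n:                 # stop at the first dummy bidder
        slate.append(j)
        j, s = succ[s][j], s + 1
    return F[0][start], slate
```

With bids (4, 3, 2), CTRs of 0.5 and 0.25 in two positions, unit utility factors and a minimum bid of 1, the sketch selects the top two bidders with utility 2.0. Setting the first bidder's utility factor to 0 shifts the choice to the second and third bidders.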

3. An Optimal Path Approach 

Frequently, problems that are amenable to dynamic programming can be cast in the form of a
shortest or longest path problem [?], and this is no exception. Let us define a network with nodes
$N_{j,p}$ for p = 1, ..., m and j = p, ..., n + 1. We also define terminal nodes $N_{0,0}$ and $N_{n+1,m+1}$.

Figure 1: Network with n > m (3 slots and 5 ads: n = 5, m = 3)

The directed edges and their associated costs are defined as:

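$(N_{0,0}, \, N_{j,1})$ : $c_{0,j,0} = 0$   $(j = 1, \dots, n + 1)$

$(N_{i,p}, \, N_{j,p+1})$ : $c_{i,j,p} = p_i A_j T_{ip}$   $(j > i \ge p = 1, \dots, m - 1)$

$(N_{j,m}, \, N_{n+1,m+1})$ : $c_{j,n+1,m} = p_j A_{j+1} T_{jm}$   $(j = m, \dots, n - 1)$

$(N_{n,m}, \, N_{n+1,m+1})$ : $c_{n,n+1,m} = p_n \, e \, T_{nm}$

$(N_{n+1,p}, \, N_{n+1,p+1})$ : $c_{n+1,n+1,p} = 0$   $(p = 1, \dots, m)$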

where $c_{i,j,p}$ is the cost for the edge directed from $N_{i,p}$ to $N_{j,p+1}$ and $e$ is the minimum bid as before.
Note that not all these edges are defined (or need to be defined) if n < m (see Figures 1 and 2).

Since the network is directed and acyclic, and the utilities/distances associated with the
non-trivial edges correspond to the utility of placing ad i in slot p, followed by ad j in slot p + 1, it is
easy to see that the longest path from $N_{0,0}$ to $N_{n+1,m+1}$ maximizes (1), or equivalently the shortest
path using the negatives of the costs defined above.
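A forward longest-path pass over this layered network can be sketched as below. The function name `longest_path_slate`, the 0-based indexing, and the use of a single dummy ad (index n in the code, standing for ad n + 1) are illustrative assumptions. The edge costs follow this section's edge list as read here: an ad pays the bid of the ad placed directly after it, an ad followed by the dummy is charged the minimum bid, and the ad in the last position pays the bid of the next-ranked bidder.

```python
def longest_path_slate(bids, ctr, weight, min_bid, m):
    """Longest path through the slate network (illustrative sketch).

    Returns (best utility, real bidder indices in slate order).
    """
    n = len(bids)
    dummy = n                              # 0-based index of dummy ad n + 1
    A = list(bids) + [min_bid]             # the dummy "bids" the minimum bid
    NEG = float("-inf")

    # dist[p][j]: best utility of a path from the source to node (j, p).
    dist = [[NEG] * (n + 1) for _ in range(m)]
    pred = [[None] * (n + 1) for _ in range(m)]
    dist[0] = [0.0] * (n + 1)              # zero-cost edges out of the source

    for p in range(m - 1):
        for i in range(n + 1):
            if dist[p][i] == NEG:
                continue                   # node not reachable
            followers = [dummy] if i == dummy else range(i + 1, n + 1)
            for j in followers:
                gain = 0.0 if i == dummy else weight[i] * ctr[i][p] * A[j]
                if dist[p][i] + gain > dist[p + 1][j]:
                    dist[p + 1][j] = dist[p][i] + gain
                    pred[p + 1][j] = i

    # Edges from the last layer into the sink.
    best, last = NEG, None
    for j in range(n + 1):
        if dist[m - 1][j] == NEG:
            continue
        gain = 0.0 if j == dummy else weight[j] * ctr[j][m - 1] * A[j + 1]
        if dist[m - 1][j] + gain > best:
            best, last = dist[m - 1][j] + gain, j

    # Trace predecessors back to the first position.
    path, j = [], last
    for p in range(m - 1, -1, -1):
        path.append(j)
        j = pred[p][j]
    path.reverse()
    return best, [j for j in path if j != dummy]
```

Because each layer is processed once in position order, no general-purpose shortest path routine is needed; the acyclic structure makes a single sweep sufficient.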
Figure 2: Network with n < m (3 slots and 2 ads: n = 2, m = 3)



Very efficient algorithms are known for the shortest path problem, but in our case the problem 
is small, so a simpler implementation suffices. 

Since the forward recursion/optimal path approach is more intuitive and visually appealing, we
shall use it for the remainder of this paper in the discussion of extensions and variations.

4. Extension to Revenue Ranking 

The present scheme extends to what is sometimes known as revenue ranking, where the ads are
ranked not just by bid, but by "expected revenue", which is modeled as the product of the advertiser's
bid $A_j$ and $Q_j$, the "quality score" or "clickability" of bidder j's ad for this query. This
quantity is thought to better represent the value of an ad than the raw bid.

The ads are now ranked according to this product, so that: 

$$A_1 Q_1 > A_2 Q_2 > \dots > A_n Q_n.$$

To preserve the condition that the expected payment for a click is at least that of the next ranked
bidder, we require that the expected cost per click (CPC) of bidder $j_p$ must be at least
$A_{j_{p+1}} Q_{j_{p+1}} / Q_{j_p}$.
Using this observation, the path technique extends to the expected revenue ranking scheme. In this
case the objective function is now:

$$\text{Maximize } U = \sum_{p=1}^{m} p_{j_p} \, A_{j_{p+1}} \frac{Q_{j_{p+1}}}{Q_{j_p}} \, T_{j_p p} \qquad (4)$$

The network model above remains the same except that the edge costs are now modified to
be:

$(N_{0,0}, \, N_{j,1})$ : $c_{0,j,0} = 0$   $(j = 1, \dots, n + 1)$

$(N_{i,p}, \, N_{j,p+1})$ : $c_{i,j,p} = p_i A_j \frac{Q_j}{Q_i} T_{ip}$   $(j > i \ge p = 1, \dots, m - 1)$

$(N_{j,m}, \, N_{n+1,m+1})$ : $c_{j,n+1,m} = p_j A_{j+1} \frac{Q_{j+1}}{Q_j} T_{jm}$   $(j = m, \dots, n - 1)$

$(N_{n+1,p}, \, N_{n+1,p+1})$ : $c_{n+1,n+1,p} = 0$   $(p = 1, \dots, m)$
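The modified cost of an interior edge can be written as a small helper. The name `revenue_ranked_edge_cost` and the 0-based list arguments are hypothetical and used only for illustration. Under the revenue ranking, the per-click price computed here never exceeds the bid of ad i, since $A_i Q_i > A_j Q_j$ implies $A_j Q_j / Q_i < A_i$.

```python
def revenue_ranked_edge_cost(i, j, p, bids, quality, ctr, weight):
    """Cost of the edge from node (i, p) to node (j, p + 1) under revenue ranking.

    Ad i sits in position p and is followed by ad j; all indices are 0-based.
    """
    price = bids[j] * quality[j] / quality[i]   # per-click price charged to ad i
    return weight[i] * price * ctr[i][p]
```

Setting every quality score to 1 recovers the bid-ranked edge cost, so one routine can serve both ranking schemes.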



5. Further Extensions 

The path technique extends to other practically useful variants. Two of these include introducing 
restrictions on the subset of ads which may be omitted from the slate, and use of a hybrid objective 
function which is made up of a weighted sum of the first and second price utilities. 



5.1. Restricted Omissions 

The algorithm(s) we have been considering assume that any appropriate ordered subset of the
ads which fits within the slate size may be chosen. In practice this may not always be true. While it
is obviously legitimate to exclude ads from the limited space when the bids (or expected revenues)
are too low, it is not so obvious that this may be done for other reasons. For example, one of the
motivations we cited for using non-unit weights $p_j$ was the need to accommodate bidders with
limited budgets. One means of doing this is to allow budgeted bidders to be held out of the notional
auction, that is, excluded from the slate. However, it is not obvious that this option should extend
to unbudgeted bidders. This must be a business decision. We therefore require a means of
specifying which ads (bidders) can be excluded for reasons other than low rank. We accomplish this by
specifying a mask or bit vector, which has a 1 if the ad can be excluded and a 0 otherwise. We
then modify the algorithm as follows:

Since each arc in the network gives the utility of including a particular ad i in position p followed 
by ad j, we consider only arcs such that: 

1. For each position p we allow i to assume the values from p up to the first ad in rank order
which has a zero mask bit. Any subsequent ads are ignored for this p. This ensures that
the unmasked ad (that is, the ad with a zero mask bit) with the highest rank is not excluded,
but that lower ranked ads which are masked are not considered for the position p.

2. For each i chosen as above, the second index j shall only run from i + 1 through the next
unmasked ad. This ensures that if an ad i can be followed by an unmasked ad it will be the
next in rank order.

Figure 3: Reduced network with mask = (1, 0, 1, 0, 1) (3 slots and 5 ads: n = 5, m = 3)

This scheme may be thought of as actually removing arcs from networks such as those shown in
the figures, or more simply implemented by modifying the longest path algorithm with the rules we
have itemized. In Figure 3 we show the reduced network obtained from Figure 1 when we specify
that mask = (1, 0, 1, 0, 1). In practice, however, we use the second technique of modifying the
algorithm.
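Rule 2 can be sketched as a small helper that lists the ads allowed to follow a given ad directly. The function name `allowed_successors`, the use of index n + 1 for the dummy ad that closes a slate, and the convention that i = 0 stands for the source node are assumptions made for this illustration; ads are numbered from 1 as in the text.

```python
def allowed_successors(i, mask, n):
    """Ads j that may directly follow ad i under the omission mask (sketch).

    mask[j - 1] == 1 means ad j may be omitted, 0 means it may not.
    Pass i = 0 to list the ads allowed in the first position.
    Index n + 1 denotes the dummy ad that ends the slate.
    """
    allowed = []
    for j in range(i + 1, n + 1):
        allowed.append(j)
        if mask[j - 1] == 0:       # an ad that must be shown cannot be skipped
            return allowed
    allowed.append(n + 1)          # all remaining ads are omissible
    return allowed
```

With mask = (1, 0, 1, 0, 1), ad 2 may be followed by ad 3 or ad 4 but not by ad 5, because ad 4 cannot be skipped. Restricting the inner loop of the longest path computation to these successors has the same effect as deleting the corresponding arcs.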

5.2. Hybrid Objectives 

Thus far we have assumed that some form of generalized second price auction is implicit in the
utility of the slate. However, even if the price per click is computed with such an assumption, we
may wish to include other factors in our utility calculation. For example, we may wish to consider
the first prices, on the assumption that in a truthful setting these are the actual values placed by
the bidders on a click for their ad. Alternatively we may be
interested in the raw number of expected clicks. Both of these situations can be accommodated by
considering a composite weighted objective function which takes into account both first and second
prices in specifying the arc costs in the longest path algorithm. Let us define the hybrid objective
as:

$$\text{Maximize } U = \sum_{p=1}^{m} \mu_{j_p} A_{j_p} T_{j_p p} + \sum_{p=1}^{m} p_{j_p} A_{j_{p+1}} \frac{Q_{j_{p+1}}}{Q_{j_p}} T_{j_p p} \qquad (5)$$

which may be re-written: 

$$\text{Maximize } U = \sum_{p=1}^{m} \left( \mu_{j_p} A_{j_p} + p_{j_p} A_{j_{p+1}} \frac{Q_{j_{p+1}}}{Q_{j_p}} \right) T_{j_p p} \qquad (6)$$

Then if we rewrite the relevant arc costs as: 

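$c_{i,j,p} = \left( \mu_i A_i + p_i A_j \frac{Q_j}{Q_i} \right) T_{ip}$   $(j > i \ge p = 1, \dots, m - 1)$

$c_{j,n+1,m} = \left( \mu_j A_j + p_j A_{j+1} \frac{Q_{j+1}}{Q_j} \right) T_{jm}$   $(j = m, \dots, n - 1)$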

we may solve the optimal path problem as before. 

This framework is very flexible and can lead to many extensions. For example, if we wish to
have a weighted combination of expected revenue and expected clicks we may set the $\mu_j$ to $1/A_j$.
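The hybrid arc cost can be illustrated with a short helper. The name `hybrid_edge_cost` and the 0-based list arguments are hypothetical; `mu` holds the first-price weights and `weight` the second-price utility factors.

```python
def hybrid_edge_cost(i, j, p, bids, quality, ctr, weight, mu):
    """Hybrid cost of the edge from node (i, p) to node (j, p + 1) (sketch)."""
    first_price = mu[i] * bids[i]                                # weight on ad i's own bid
    second_price = weight[i] * bids[j] * quality[j] / quality[i]
    return (first_price + second_price) * ctr[i][p]
```

With `mu[i] = 1 / bids[i]` the first-price term contributes exactly the CTR, so the edge cost becomes expected clicks plus weighted expected revenue. Setting the second-price weight to 0 as well leaves expected clicks alone.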

Our colleague Zoe Abrams has also informed us [?] that the dynamic programming approach to
column generation extends to the Vickrey-Clarke-Groves (VCG) auction mechanism.

6. Computational Results 

All of the algorithms we have described are efficient in terms of number of operations, especially 
since the numbers involved are relatively small in the on-line advertising framework. Typically m 
is less than or equal to about 12, and the number of bidders n which need to be considered for 
inclusion in a slate is less than 100, and may even be less than m. 

We have implemented the forward algorithm in a straightforward way in C++, to be called as a
subroutine in the column generation algorithm of [?]. When run on a 32-bit Linux box with a 2.8
GHz Xeon processor, it takes an average of 25 microseconds per query over a sample of 5000 queries
with 12 ad positions and between 1 and 77 candidate ads. This includes setting up the data
structures as well as the actual path computation, and is obtained without any attempt to optimize
the code other than using the -O2 option of the gcc compiler. We can therefore afford to execute
the algorithm repetitively, either in real time in an on-line setting, or repeated many times as a
subroutine.
7. Conclusion

Sponsored search auctions have recently received considerable attention, but the subsequent 
problem of how to implement, or adapt, the outcomes in the presence of complicating factors such 
as budget constraints appears to have been less well studied. We have shown that this can be
accomplished in cases of practical interest by a simple but very fast dynamic programming algorithm.
Appendix

The algorithm(s) described in this paper were developed as a column generating subroutine for
the linear programming model (LP) described in [?] for optimizing sponsored search ad delivery
subject to budget constraints. The concept of a "slate" is put forward in that paper, corresponding to
columns of the LP. We may formally state the LP as follows:
Indices

    i = 1, ..., N       The queries
    j = 1, ..., M       The bidders
    k = 1, ..., K_i     The slates (for query i)

Data

    $d_j$        The total budget of bidder j
    $v_i$        Expected number of occurrences of keyword i
    $a_{ijk}$    Expected cost to bidder j if slate k is shown for keyword i
    $r_{ik}$     Objective function coefficient for slate k for keyword i

Variables

    $x_{ik}$     Number of times to show slate k for keyword i

Constraints

(Budget)

$$\sum_i \sum_k a_{ijk} x_{ik} \le d_j \quad \forall j \qquad (7)$$

(Inventory)

$$\sum_k x_{ik} \le v_i \quad \forall i \qquad (8)$$

Objective

$$\text{Maximize } \sum_i \sum_k r_{ik} x_{ik}$$

Each column of the LP, that is the $a_{ijk}$ values, corresponds to the expected cost per click (CPC)
to budgeted advertisers if their ad is clicked on when slate k is shown for query i. If there are more
than a handful of budgeted bidders for a query, the number of possible slates is enormous. We
therefore require a method for generating those columns which may be included in the LP optimum
solution (the well-known idea of "column generation" [?]). When the objective function coefficients
are the expected revenue from a slate (i.e. $r_{ik} = \sum_j a_{ijk}$), and the dual values corresponding to the
budget constraints are $\pi_j$, the subproblem we wish to solve is of the form (1) with $p_j = 1 - \pi_j$. If
the objective coefficients are the bid values (assumed to reflect a bidder's true value for a click), the
subproblem is of the form (5) with $\mu_j = 1$ and $p_j = -\pi_j$. In either case, the slate generated can
potentially improve the LP solution if the value is greater than the dual value (say $\gamma_i$) for the i-th
query volume constraint. See [?] for much greater detail.
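The weights used in the first subproblem follow from the standard reduced-cost calculation for a column of the LP. The short derivation below is a sketch using the Appendix notation; the symbol $\bar{r}_{ik}$ for the reduced cost and $\gamma_i$ for the dual value of the inventory constraint are assumed here, as is the convention that $\pi_j = 0$ for any bidder without a budget constraint.

```latex
% Reduced cost of the column for slate k of query i, with dual values
% \pi_j on the budget rows (7) and \gamma_i on the inventory row (8).
\begin{align*}
\bar{r}_{ik} &= r_{ik} - \sum_j \pi_j \, a_{ijk} - \gamma_i \\
             &= \sum_j (1 - \pi_j) \, a_{ijk} - \gamma_i
                \qquad \text{when } r_{ik} = \sum_j a_{ijk}.
\end{align*}
% The column is a candidate to improve the LP only if \bar{r}_{ik} > 0,
% that is, only if \sum_j (1 - \pi_j) a_{ijk} > \gamma_i.
```

Since $a_{ijk}$ is the expected payment of bidder j under slate k, the sum on the last line is the slate utility with $p_j = 1 - \pi_j$, so maximizing it over slates finds the most promising column for query i.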